ABSTRACT

While the novel Middle East Respiratory Syndrome Coronavirus (MERS-CoV) is closely
related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV
HKU5 (Pi-BatCoV HKU5) in bats from Hong Kong, and other potential lineage C
betacoronaviruses in bats from Africa, Europe and America, its animal origin remains 
obscure. To better understand the role of bats in its origin, we examined the molecular 
epidemiology and evolution of lineage C betacoronaviruses among bats. Ty-BatCoV 
HKU4 and Pi-BatCoV HKU5 were detected in 29% and 25% of alimentary samples from
lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus)
respectively. Sequencing of their RdRp, S and N genes revealed that MERS-CoV is more
closely related to Pi-BatCoV HKU5 in RdRp (92.1-92.3% aa identities) but to
Ty-BatCoV HKU4 in S (66.8-67.4% aa identities) and N (71.9-72.3% aa identities).
Although both viruses were under purifying selection, the S of Pi-BatCoV HKU5
displayed marked sequence polymorphisms and more positively selected sites than that of
Ty-BatCoV HKU4, suggesting that Pi-BatCoV HKU5 may generate variants to occupy
new ecological niches along with its host which faces diverse habitats. Molecular clock 
analysis showed that they diverged from a common ancestor with MERS-CoV at least 
several centuries ago. Although MERS-CoV may have diverged from potential lineage C 
betacoronaviruses in European bats more recently, these bat viruses were unlikely to be
the direct ancestor of MERS-CoV. Intensive surveillance for lineage C betaCoVs in
Pipistrellus and related bats with diverse habitats, and other animals from the Middle
East may fill the evolutionary gap.

INTRODUCTION

Coronaviruses (CoVs) infect humans and a wide variety of animals, causing respiratory, 
enteric, hepatic and neurological diseases of varying severity. They have been classified 
traditionally into groups 1, 2 and 3, based on genotypic and serological characteristics (1, 
2). Recently, the nomenclature and taxonomy of CoVs have been revised by the 
Coronavirus Study Group of the International Committee for Taxonomy of Viruses 
(ICTV). They are now classified into three genera, Alphacoronavirus, Betacoronavirus 
and Gammacoronavirus, replacing the three traditional groups (3). Novel CoVs, which 
represented a novel genus, Deltacoronavirus, have also been identified (4, 5). While 
CoVs from all four genera can be found in mammals, bat CoVs are likely the gene source 
of Alphacoronavirus and Betacoronavirus, and avian CoVs are the gene source of 
Gammacoronavirus and Deltacoronavirus (5-7). 

CoVs are well known for their high frequency of recombination and high mutation
rates, which may allow them to adapt to new hosts and ecological niches (1, 8-12). This
is best exemplified by the severe acute respiratory syndrome (SARS) epidemic, which
was caused by SARS CoV (13, 14). The virus has been shown to have originated from
animals, with horseshoe bats as the natural reservoir and palm civet as the intermediate
host allowing animal-to-human transmission (15-18). Since the SARS epidemic, many
other novel CoVs in both humans and animals have been discovered (4, 7, 19-24). In
particular, a previously unknown diversity of CoVs has been described in bats from
China and other countries, suggesting that bats are important reservoirs of alphaCoVs and
betaCoVs (16, 18, 25-32).

In September 2012, two cases of severe community-acquired pneumonia were
reported in Saudi Arabia, which were subsequently found to be caused by a novel CoV, 
Middle East Respiratory Syndrome Coronavirus (MERS-CoV), previously known as 
human betaCoV 2c EMC/2012 (33, 34, 35). As of May 2013, a total of 40
laboratory-confirmed cases of MERS-CoV infection have been reported with 20 deaths (36),
giving a crude fatality rate of 50%. So far, most cases of MERS-CoV infection presented with
severe acute respiratory illness (36, 37). A macaque model for MERS-CoV infection has
also been established, which showed that the virus caused localized-to-widespread
pneumonia in all infected animals (38). The viral virulence may be related to the ability
of MERS-CoV to evade innate immunity with an attenuated interferon-β response (39-41).
Moreover, the ability to cause human-to-human transmission has raised the
possibility of another SARS-like epidemic (36, 37). However, the source of this novel 
CoV is still obscure, which has hindered public health and infection control strategies for 
disease prevention. Phylogenetically, MERS-CoV belongs to Betacoronavirus lineage C, 
being closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and
Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) previously discovered in lesser bamboo
bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus) in Hong
Kong, China respectively (31, 32, 42, 43). Moreover, potential viruses with partial gene
sequences closely related to MERS-CoV have also been detected in bats from Africa, 
Europe and America, although complete genome sequences were not available (44, 45). 
MERS-CoV is able to infect various mammalian cell lines including primate, porcine, bat
and rabbit cells, which may be explained by the use of the evolutionarily conserved
dipeptidyl peptidase 4 (DPP4) as its functional receptor (46, 47). These findings suggest that
MERS-CoV may possess broad species tropism and may have emerged from animals.
However, the direct ancestor virus and animal reservoir of MERS-CoV are yet to be
identified.

To better understand the evolutionary origin of MERS-CoV and the possible role 
of bats as the reservoir for its ancestral viruses, studies on the genetic diversity and 
evolution of lineage C betaCoVs in bats would be important. We attempted to study the 
epidemiology of lineage C betaCoVs, including Ty-BatCoV HKU4 and Pi-BatCoV
HKU5, among various bat species in Hong Kong, China. The complete RNA-dependent
RNA polymerase (RdRp), spike (S) and nucleocapsid (N) genes of 13 Ty-BatCoV HKU4
and 15 Pi-BatCoV HKU5 strains were sequenced to assess their genetic diversity and
evolution. The results revealed that the two viruses were stably evolving in their
respective hosts, and diverged from their common ancestor a long time ago. However,
the S protein of Pi-BatCoV HKU5 exhibited marked sequence divergence and many
more positively selected sites than that of Ty-BatCoV HKU4, which may suggest the
ability of Pi-BatCoV HKU5 along with its host to occupy new ecological niches. The
potential implications for the animal origin of MERS-CoV are also discussed.

METHODS

Collection of bat samples. Various bat species were captured from different locations in 
Hong Kong, China over a 7-year period (April 2005 to August 2012). Their respiratory 
and alimentary specimens were collected using procedures described previously (16, 48). 
To prevent cross contamination, specimens were collected using disposable swabs with 
protective gloves changed between samples. All specimens were immediately placed in 
viral transport medium containing Earle's balanced salt solution (Invitrogen, New York, 
United States), 20% glucose, 4.4% NaHCO3, 5% bovine albumin, 50000 μg/ml
vancomycin, 50000 μg/ml amikacin and 10000 units/ml nystatin, before transportation to the
laboratory for RNA extraction. 

RNA extraction. Viral RNA was extracted from the respiratory and alimentary 
specimens using QIAamp Viral RNA Mini Kit (QIAgen, Hilden, Germany). The RNA 
was eluted in 50 μl of AVE buffer (QIAgen) and was used as the template for RT-PCR.

RT-PCR for CoVs and DNA sequencing. CoV detection was performed by
amplifying a 440-bp fragment of the RdRp gene of CoVs using conserved primers
(5’-GGTTGGGACTATCCTAAGTGTGA-3’ and 5’-CCATCATCAGATAGAATCATCATA-3’)
designed by multiple alignments of the
nucleotide sequences of available RdRp genes of known CoVs as described previously
(17, 24). Reverse transcription was performed using the SuperScript III kit (Invitrogen,
San Diego, CA, USA). The PCR mixture (25 μl) contained cDNA, PCR buffer (10 mM
Tris-HCl pH 8.3, 50 mM KCl, 3 mM MgCl2 and 0.01% gelatin), 200 μM of each dNTP
and 1.0 U Taq polymerase (Applied Biosystems, Foster City, CA, USA). The mixtures
were amplified in 60 cycles of 94°C for 1 min, 48°C for 1 min and 72°C for 1 min and a
final extension at 72°C for 10 min in an automated thermal cycler (Applied Biosystems,
Foster City, CA, USA). Standard precautions were taken to avoid PCR contamination
and no false positives were observed in negative controls.

The PCR products were gel-purified using the QIAquick gel extraction kit 
(QIAgen, Hilden, Germany). Both strands of the PCR products were sequenced twice 
with an ABI Prism 3700 DNA Analyzer (Applied Biosystems, Foster City, CA, USA), 
using the two PCR primers. The sequences of the PCR products were compared with 
known sequences of the RdRp genes of CoVs in the GenBank database to identify 
lineage C betaCoVs. 

Sequencing and analysis of the complete RdRp, S and N genes of Ty-BatCoV
HKU4 and Pi-BatCoV HKU5 strains. To study the genetic diversity and evolution of
Ty-BatCoV HKU4 and Pi-BatCoV HKU5 detected in bats, the complete RdRp, S and N
genes of 13 Ty-BatCoV HKU4 strains and 15 Pi-BatCoV HKU5 strains detected at
different times and/or places, in addition to the nine previous strains with complete genome
sequences, were amplified and sequenced using primers designed according to available
genome sequences (Table 1) (32). The sequences of the PCR products were assembled
manually to produce the complete RdRp, S and N gene sequences. Multiple sequence 
alignments were constructed using MUSCLE in MEGA version 5 (49, 50). Phylogenetic 
trees were constructed using the maximum-likelihood method (51), with bootstrap values
calculated from 100 trees. Protein family analysis was performed using PFAM and
InterProScan (52, 53). Prediction of transmembrane domains was performed using
TMHMM (54). The heptad repeat (HR) regions were predicted by using the coiled-coil
prediction program MultiCoil2 (55).

Estimation of synonymous and non-synonymous substitution rates. The
number of synonymous substitutions per synonymous site, Ks, and the number of
non-synonymous substitutions per non-synonymous site, Ka, for each coding region were
calculated using the Nei-Gojobori method (Jukes-Cantor) in MEGA version 5 (50).
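
The Ka and Ks calculation described above can be illustrated with a minimal Python sketch of the Nei-Gojobori counting scheme followed by the Jukes-Cantor correction. This is not the MEGA implementation: the function names are hypothetical, mutations to stop codons are simply counted as non-synonymous, and the input is assumed to be two aligned, in-frame coding sequences.

```python
import math
from itertools import permutations

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites (0 to 3) in one sense codon."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Simplification: changes to a stop codon count as non-synonymous.
            if CODE[mutant] == CODE[codon]:
                s += 1.0 / 3.0
    return s


def codon_diffs(c1, c2):
    """Synonymous and non-synonymous differences, averaged over mutation paths."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0, 0.0
    paths = []
    for order in permutations(diff_pos):
        cur, sd, nd, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":  # skip paths that pass through a stop codon
                valid = False
                break
            if CODE[nxt] == CODE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        if valid:
            paths.append((sd, nd))
    if not paths:  # every path hit a stop codon: treat all changes as non-synonymous
        return 0.0, float(len(diff_pos))
    return (sum(p[0] for p in paths) / len(paths),
            sum(p[1] for p in paths) / len(paths))


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if p <= 0.0:
        return 0.0
    if p >= 0.75:
        return float("nan")  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(seq1, seq2):
    """Return (Ka, Ks) for two aligned, in-frame coding sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s_sites = sd = nd = 0.0
    n_codons = 0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        # Skip codons containing gaps, ambiguous bases or stop codons.
        if c1 not in CODE or c2 not in CODE or "*" in (CODE[c1], CODE[c2]):
            continue
        n_codons += 1
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2.0
        d_syn, d_non = codon_diffs(c1, c2)
        sd += d_syn
        nd += d_non
    n_sites = 3.0 * n_codons - s_sites
    if s_sites <= 0.0 or n_sites <= 0.0:
        return float("nan"), float("nan")
    return jukes_cantor(nd / n_sites), jukes_cantor(sd / s_sites)
```

A Ka/Ks ratio well below 1 for a gene is the pattern interpreted in this study as purifying selection.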

Detection of positive selection. Sites under positive selection in the S gene in
Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were inferred using single-likelihood ancestor
counting (SLAC), fixed effects likelihood (FEL) and random effects likelihood (REL)
methods as implemented in DataMonkey server (http://www.datamonkey.org) (56).
Positive selection for a site was considered to be statistically significant if the P-value
was <0.1 for SLAC and FEL methods or posterior probability was ≥90% for the REL
method. A mixed-effects model of evolution (MEME) was further used to identify
positively selected sites under episodic diversifying selection in particular positions in
sublineages within a phylogenetic tree even when positive selection is not evident across
the entire tree (57). Positively selected sites with a P-value <0.05 were reported.
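
The significance criteria above can be expressed as a small filter. The following Python sketch is illustrative only: the function names and the input layout (one dictionary of per-site values per method) are assumptions, and it does not reproduce the DataMonkey output format.

```python
def is_positively_selected(method, value):
    """Apply the per-method cut-off stated in the text.

    `value` is a P-value for SLAC, FEL and MEME, and a posterior
    probability (0 to 1) for REL.
    """
    method = method.upper()
    if method in ("SLAC", "FEL"):
        return value < 0.1
    if method == "REL":
        return value >= 0.90
    if method == "MEME":
        return value < 0.05
    raise ValueError("unknown method: " + method)


def significant_sites(results):
    """Map each method to the sorted list of sites passing its cut-off.

    `results` is a hypothetical structure: {method: {site_number: value}}.
    """
    return {
        method: sorted(site for site, value in per_site.items()
                       if is_positively_selected(method, value))
        for method, per_site in results.items()
    }


# Made-up values for illustration; these are not results from the study.
example = {
    "SLAC": {10: 0.08, 25: 0.40},
    "FEL": {10: 0.02, 25: 0.11},
    "REL": {10: 0.95, 25: 0.89},
    "MEME": {10: 0.04, 25: 0.06},
}
selected = significant_sites(example)
```

Keeping the thresholds in one function makes it easy to see that REL is the only method judged on a posterior probability rather than a P-value.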

Estimation of divergence time. As RdRp and N genes are relatively conserved
across CoVs and therefore most likely reflect viral phylogeny, divergence time was
calculated using complete RdRp and N gene sequence data of Ty-BatCoV HKU4,
Pi-BatCoV HKU5 and MERS-CoV strains, and 904-bp partial RdRp sequence data of
lineage C betaCoVs from European bats, with the Bayesian Markov Chain Monte Carlo
(MCMC) approach as implemented in BEAST (Version 1.7.4) as described previously (9,
17, 21, 44, 58, 59). One parametric model (Constant Size) and one non-parametric model
(Bayesian Skyline with five groups) tree priors were used for the inference. Analyses
were performed under the Hasegawa-Kishino-Yano (HKY) model with coding sequence
partitioned into 1st + 2nd versus 3rd positions and rate variation between sites described
by a four-category discrete gamma distribution using both strict and relaxed [uncorrelated
lognormal (Ucld) and uncorrelated exponential (Uced)] molecular clocks. The MCMC run
was 2 × 10⁸ steps long, sampling every 1,000 steps. Convergence was assessed on the
basis of the effective sample size after a 10% burn-in using Tracer software Version 1.5
(58). The mean time of the most recent common ancestor (tMRCA) and the highest
posterior density regions at 95% (HPD) were calculated, and the best-fitting model was
selected by a Bayes factor, using marginal likelihoods implemented in Tracer (60).
Bayesian Skyline under a relaxed clock model with Uced was adopted for making 
inferences, as this model fitted the data better than other models tested by Bayes factor 
analysis (data not shown) and allowed variations in substitution rates among lineages. All 
trees were summarized in a target tree by the Tree Annotator program included in the 
BEAST package by choosing the tree with the maximum sum of posterior probabilities 
(maximum clade credibility) after a 10% burn-in. 
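
To make the summary statistics concrete, the sketch below shows one common way to obtain a posterior mean and a 95% HPD interval from a list of sampled tMRCA values after discarding a 10% burn-in. It is a simplified illustration with hypothetical function names, not the algorithm used inside Tracer or BEAST, and it assumes a unimodal posterior so that the HPD region is a single interval.

```python
import math


def discard_burn_in(samples, burn_in=0.10):
    """Drop the first `burn_in` fraction of an MCMC trace."""
    samples = list(samples)
    return samples[int(len(samples) * burn_in):]


def posterior_mean(samples, burn_in=0.10):
    """Mean of the trace after burn-in."""
    kept = discard_burn_in(samples, burn_in)
    if not kept:
        raise ValueError("no samples left after burn-in")
    return sum(kept) / len(kept)


def hpd_interval(samples, mass=0.95, burn_in=0.10):
    """Narrowest interval containing `mass` of the post-burn-in samples."""
    ordered = sorted(discard_burn_in(samples, burn_in))
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples left after burn-in")
    # Number of samples the interval must cover (small epsilon guards
    # against floating-point round-up when mass * n is a whole number).
    window = max(1, math.ceil(mass * n - 1e-9))
    best = min(range(n - window + 1),
               key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[best], ordered[best + window - 1]
```

For a skewed posterior the HPD interval is generally not symmetric about the mean, which is why the intervals reported for tMRCA estimates can extend much further into the past than toward the present.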

Nucleotide sequence accession numbers. The nucleotide sequences of the
complete RdRp, S and N genes of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 have been
lodged within the GenBank sequence database under accession no. KC522036 to
KC522119.

RESULTS

Detection of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 from bat samples. A total of 
5426 respiratory and 5260 alimentary specimens from 5481 bats of 21 different species 
were obtained. RT-PCR for a 440-bp fragment in the RdRp genes of CoVs detected the 
presence of lineage C betaCoVs from two bat species, including Ty-BatCoV HKU4 in 29 
(29%) of 99 alimentary samples from lesser bamboo bat (Tylonycteris pachypus) and
Pi-BatCoV HKU5 in 55 (25%) of 216 alimentary samples from Japanese pipistrelle
(Pipistrellus abramus) respectively (Table 2). None of the respiratory samples were
positive for lineage C betaCoVs. Bats positive for Ty-BatCoV HKU4 and Pi-BatCoV
HKU5 were from seven and 13 sampling locations in Hong Kong respectively. No
obvious disease was observed in bats positive for Ty-BatCoV HKU4 and Pi-BatCoV
HKU5. Ty-BatCoV HKU4 was found only in adult bats while Pi-BatCoV HKU5 was
found in both adult and juvenile bats.
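
The detection rates quoted above follow directly from the sample counts. As a worked illustration, the Python sketch below recomputes them and attaches a Wilson score interval; the interval is an addition for illustration only and was not reported in this study, and the function name is hypothetical.

```python
import math


def detection_rate(positive, tested, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if tested <= 0 or not 0 <= positive <= tested:
        raise ValueError("need 0 <= positive <= tested and tested > 0")
    p = positive / tested
    denom = 1.0 + z * z / tested
    centre = (p + z * z / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z * z / (4 * tested ** 2)) / denom
    return p, centre - half, centre + half


# Counts taken from the text: alimentary samples positive / tested.
hku4_rate = detection_rate(29, 99)    # Ty-BatCoV HKU4 in lesser bamboo bat
hku5_rate = detection_rate(55, 216)   # Pi-BatCoV HKU5 in Japanese pipistrelle

for name, (p, low, high) in (("Ty-BatCoV HKU4", hku4_rate),
                             ("Pi-BatCoV HKU5", hku5_rate)):
    print(f"{name}: {100 * p:.1f}% ({100 * low:.1f}-{100 * high:.1f}%)")
```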

Complete RdRp, S and N gene analysis of Ty-BatCoV HKU4 and Pi-BatCoV
HKU5 strains. To study the genetic diversity and evolution of lineage C betaCoVs in
bats, the complete RdRp, S and N gene sequences of 13 Ty-BatCoV HKU4 strains and 15
Pi-BatCoV HKU5 strains were determined. Comparison of the deduced aa sequences of
the RdRp, S and N genes of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 to those of
MERS-CoV showed that MERS-CoV is more closely related to Pi-BatCoV HKU5 than
to Ty-BatCoV HKU4 (92.1-92.3% versus 89.6-90% identities) in the RdRp gene, but
more closely related to Ty-BatCoV HKU4 than to Pi-BatCoV HKU5 in the S
(66.8-67.4% versus 63.4-64.5% identities) and N (71.9-72.3% versus 69.5-70.5% identities)
genes (Table 3). Moreover, MERS-CoV is more closely related to Ty-BatCoV HKU4 and
Pi-BatCoV HKU5 belonging to Betacoronavirus lineage C than to CoVs belonging to
Betacoronavirus lineages A, B and D (Table 3). Phylogenetic analysis of the complete
RdRp, S and N gene sequences of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 showed that
the sequences from the 13 Ty-BatCoV HKU4 strains and 15 Pi-BatCoV HKU5 strains
formed two distinct clusters in all three genes, being closely related to each other and to
MERS-CoV (Fig. 1). Interestingly, unlike the S genes of the 13 Ty-BatCoV HKU4
strains which shared highly similar sequences with very short branch lengths, the S genes
of Pi-BatCoV HKU5 displayed marked sequence polymorphisms among the 15 strains,
with up to 14% nucleotide and 12% amino acid (aa) differences.
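
Identity ranges such as those quoted above are obtained by comparing one aligned sequence against each strain in turn. A minimal Python sketch follows; it assumes pre-aligned sequences of equal length and excludes gapped columns from the denominator, which is only one of several conventions and may differ from the software used for Table 3. The function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical positions between two aligned sequences.

    Columns where either sequence has a gap are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += (a == b)
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_range(query, strains):
    """Lowest and highest percent identity of `query` against several strains."""
    values = [percent_identity(query, s) for s in strains]
    return min(values), max(values)


# Toy peptides for illustration only; not sequences from this study.
low, high = identity_range("MKTAYIAK", ["MKTAYLAK", "MKTAYIAK"])
```

Reporting the minimum and maximum over all strains of a virus gives ranges of the form used in the text, for example 92.1-92.3%.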

The S proteins of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 comprised 1350-1352
and 1352-1359 aa respectively. A potential cleavage site, though not perfectly conserved,
could be present in the S proteins of Ty-BatCoV HKU4 (S[TM]FR) and Pi-BatCoV
HKU5 (R[VFL][ALR]R). InterProScan analysis predicted them as type I membrane
glycoproteins, with most of the protein (residues 18/21/22 to 1294/1296/1297 for
Ty-BatCoV HKU4 and residues 22 to 1296/1297/1298/1301/1302/1303 for Pi-BatCoV
HKU5) exposed on the outside of the virus, a transmembrane domain (residues
1295/1297/1298 to 1317/1319/1320 for Ty-BatCoV HKU4 and residues
1297/1298/1299/1302/1303/1304 to 1319/1320/1321/1324/1325/1326 for Pi-BatCoV
HKU5) at the C terminus, followed by a cytoplasmic tail rich in cysteine residues. Two
heptad repeats (HR), important for membrane fusion and viral entry (61), were located at
residues 978/980 to 1124/1126 (HR1) and 1251/1253 to 1285/1287 (HR2) for
Ty-BatCoV HKU4, and residues 978/979/983/984 to 1124/1125/1129/1130 (HR1) and
1253/1254/1258/1259 to 1287/1288/1292/1293 (HR2) for Pi-BatCoV HKU5. All
cysteine residues are conserved between the S of Ty-BatCoV HKU4, Pi-BatCoV HKU5
and MERS-CoV. While CoVs are known to utilize a variety of host receptors for cell
entry, a number of closely related as well as distantly related CoVs may utilize the same 
receptor. For example, aminopeptidase N (CD13) has been shown to be the receptor for 
various alphaCoVs including HCoV 229E, canine CoV (CCoV), feline infectious 
peritonitis virus (FIPV), porcine epidemic diarrhea coronavirus (PEDV) and 
transmissible gastroenteritis coronavirus (TGEV) (62, 63). Moreover, human angiotensin- 
converting enzyme 2 (hACE2) has been found to be the receptor for both HCoV NL63, 
an alphaCoV, as well as SARS CoV, a betaCoV, although they utilize different
receptor-binding sites (64, 65). As for lineage A betaCoVs, HCoV OC43 and the closely related
bovine CoV utilize N-acetyl-9-O-acetylneuraminic acid as receptor, whereas
carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) is the receptor
for mouse hepatitis virus (MHV) (66-70). The S proteins of Ty-BatCoV HKU4 and
Pi-BatCoV HKU5 as well as MERS-CoV did not exhibit significant sequence homology to
the known receptor-binding domains (RBDs) of other CoVs including the betaCoVs such as
SARS CoV and HCoV OC43 (71-78). Recently, DPP4 has been identified as a functional
receptor for MERS-CoV, although the exact RBD is still unknown (47, 79). Based on the
X-ray crystal structure of the RBD in the SARS CoV S protein, residues 377 to
662 have been predicted as a possible RBD for MERS-CoV (80). Using the same
methodology, residues 387 to 587 in the Ty-BatCoV HKU4 S protein and residues 389 to
580 in the Pi-BatCoV HKU5 S protein were predicted as their possible RBDs. However, further
studies are required to elucidate the receptors for Ty-BatCoV HKU4 and Pi-BatCoV
HKU5 and their RBDs.

Estimation of synonymous and non-synonymous substitution rates. In line
with phylogenetic analysis, multiple alignment of the S gene sequences showed that
Pi-BatCoV HKU5 possessed more synonymous and non-synonymous substitutions than
Ty-BatCoV HKU4 (Table 4). Compared to Ty-BatCoV HKU4 in which 58 aa positions
contained substitutions, 253 aa positions in Pi-BatCoV HKU5 contained substitutions
among which >2 aa were encoded at 67 aa positions (Fig. 2 and 3). The Ka/Ks ratios for
the RdRp, S and N genes among different strains of Ty-BatCoV HKU4 and Pi-BatCoV
HKU5 were determined (Table 4). The Ka/Ks ratios were generally low, although the S
genes of both viruses showed relatively higher ratios (0.118) compared to RdRp and N
genes. This suggested that these genes were under purifying selection. Nevertheless, the
Ka and Ks of the S genes of Pi-BatCoV HKU5 were relatively high compared to those of
Ty-BatCoV HKU4, which reflected the marked sequence polymorphisms among
different strains.

Detection of positive selection in S genes. The S genes of Pi-BatCoV HKU5 
possessed more positively selected sites than the S genes of Ty-BatCoV HKU4 (Fig. 4). 
Only two and five aa positions in Ty-BatCoV HKU4 were found to be under positive 
selection using REL and MEME methods respectively, whereas no significant positive 
selection was identified by SLAC and FEL methods. In contrast, two, 12, 27 and 43 aa 
positions in Pi-BatCoV HKU5 were found to be under positive selection using SLAC, 
FEL, REL and MEME methods respectively. Most of these sites were distributed within 
the S1 domain, indicating that this domain may have been under functional constraints. 

Estimation of divergence time. To estimate the divergence time of Ty-BatCoV
HKU4, Pi-BatCoV HKU5 and MERS-CoV strains, their complete RdRp and N gene
sequences were subjected to molecular clock analysis using the relaxed clock model with
Uced. Using complete RdRp gene sequences, tMRCA of MERS-CoV and Pi-BatCoV
HKU5 was estimated at 1520.09 (HPDs, 745.73 to 1956.12) (Fig. 5A). Using complete N
gene sequences, tMRCA of MERS-CoV, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 was
estimated at 1323.51 (HPDs, 383.58 to 1897.75) (Fig. 5B). Since partial RdRp gene
sequences closely related to the corresponding sequence of MERS-CoV have recently
been detected in European bats, molecular clock analysis was also performed to estimate
their divergence time. Using the 904-bp partial RdRp sequences, tMRCA of MERS-CoV
and three European bat CoV strains (BtCoV 8-691, BtCoV 8-724 and BtCoV UKR-G17)
was estimated at 1859.32 (HPDs, 1636.67 to 1987.55) (Fig. 5C). The estimated mean
substitution rates of the complete RdRp and N gene, and partial RdRp sequence data sets
were 5.12 × 10⁻⁴, 8.642 × 10⁻⁴ and 7.407 × 10⁻⁴ substitutions per site per year respectively,
comparable to those observed in other CoVs (9, 17, 59, 81, 82).

DISCUSSION

In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were found to be highly
prevalent among lesser bamboo bat and Japanese pipistrelle in Hong Kong respectively,
with detection rates of 25-29% in their alimentary samples. In line with previous studies,
MERS-CoV is more closely related to Betacoronavirus lineage C than to lineages A, B and D
in the RdRp, S and N genes (34, 42, 43). Nevertheless, the genetic distance between
MERS-CoV and the various strains of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 was still
large, with their S proteins having ≤67.4% aa identities. Two recent studies have
identified partial gene sequences closely related to MERS-CoV in bats from Africa, 
Europe and America, suggesting that lineage C betaCoVs are distributed in bats 
worldwide (44, 45). In one study, CoVs related to MERS-CoV were detected in 46 
(24.9%) Nycteris bats and 40 (14.7%) Pipistrellus bats from Ghana and Europe using
RT-PCR targeting a 398-bp fragment of the RdRp gene (44). The extended 904-bp RdRp
sequences of three strains from Romania and Ukraine showed that they shared
87.7-88.1% nucleotide and 98.3% amino acid identities to MERS-CoV, compared to
80.3-82%/82.4-83.7% nucleotide and 92-92.4%/94-94.4% amino acid identities between
Ty-BatCoV HKU4/Pi-BatCoV HKU5 and MERS-CoV respectively in the corresponding
regions. In another study, screening of 606 bats from Mexico showed the presence of a
betaCoV also closely related to MERS-CoV in a Nyctinomops laticaudatus bat (45).
Although the authors claimed the use of a 329-bp fragment of the RdRp gene for
RT-PCR and sequence analysis, the available sequence was in fact within nsp14. Analysis of
this partial nsp14 sequence showed that it shared 85.7% nucleotide and 95.5% amino acid
identities to MERS-CoV (45), compared to 81.9%/83.4-84.2% nucleotide and
88.6%/92% amino acid identities between Ty-BatCoV HKU4/Pi-BatCoV
HKU5 and MERS-CoV respectively in the corresponding regions. However, complete
gene sequences were not available from these bat CoVs to allow more detailed 
phylogenetic analysis. Molecular clock analysis of the complete RdRp gene dated the 
tMRCA of MERS-CoV and Pi-BatCoV HKU5 at around 1520, whereas analysis of the N
gene dated the tMRCA of MERS-CoV, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 at
around 1324. Using the 904-bp RdRp sequences available from the three European
strains, the tMRCA of MERS-CoV and European bat CoV strains was dated at around
1859. Our results suggested that Ty-BatCoV HKU4, Pi-BatCoV HKU5 and MERS-CoV
diverged at least centuries ago from their common ancestor. Although MERS-CoV
and the European bat CoV strains were estimated to have diverged more recently, this is
unlike the situation in SARS-related CoVs which only diverged between civet and bat
strains several years before the SARS epidemic (17). Therefore, these bat lineage C
betaCoVs were unlikely to be the direct ancestor of MERS-CoV. However, the present analysis
is limited by the lack of more sequences from potential intermediate virus species/strains 
with widely distributed and well-determined dates, which better reflect the different 
selective pressures over the long period of time as these viruses evolved. Further studies 
on bats and other animals are required to fill the gap between these bat lineage C 
betaCoVs and MERS-CoV during their evolution. Moreover, longer gene or complete 
genome sequence data from these animal viruses would be important for more accurate 
taxonomic and evolutionary studies. 

The divergent sequences of the S genes of Pi-BatCoV HKU5 may suggest that the
virus has a better ability to generate variants to occupy new ecological niches. The S
proteins of CoVs are responsible for receptor binding and host adaptation, and are
therefore one of the most variable regions within CoV genomes (16, 18, 28). Studies on 
SARS CoV have shown that changes in its S protein, both within and outside of receptor 
binding domain, could govern CoV cross-species transmission and emergence in new 
host populations (83, 84). We have also previously demonstrated recent interspecies 
transmission of an alphaCoV, BatCoV HKU10, from Leschenault’s rousettes to Pomona 
leaf-nosed bats, and the virus has been rapidly adapting in the new host by changing its S 
protein (59). In this study, Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were exclusively
detected in lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle
(Pipistrellus abramus) respectively. Moreover, the Ka/Ks ratios of the RdRp, S and N
genes in both viruses were low, supporting the view that the two bat species were the respective
primary reservoirs for the two CoVs. Nevertheless, unlike that of Ty-BatCoV HKU4, the
S gene of Pi-BatCoV HKU5 exhibited much higher sequence divergence among different
strains due to both synonymous and non-synonymous substitutions. Moreover, a much
higher number of positively selected sites was observed in the S gene of Pi-BatCoV
HKU5 than in that of Ty-BatCoV HKU4, with most of the sites under selection being
distributed within the S1 region which likely contains the RBD. This suggested that the
S1 region of Pi-BatCoV HKU5 may have been under functional constraints in its host
species, Japanese pipistrelle, which may have favored adaptation to new
hosts/environments.

The marked polymorphisms in the S protein of Pi-BatCoV HKU5 may reflect the
biological characteristics of its host species, Japanese pipistrelle, which is a small,
insectivorous bat with body weight 4 to 10 g. It is considered the most common bat
species found in urban areas of Hong Kong (85). While it is abundant in wetland areas,
its roosts are frequently found in towns and villages, as well as various types of buildings 
and other man-made structures, such as fans or air-conditioners. It is also known to utilize 
bat houses or boxes as its roosts. Such diverse habitat and adaptability to harsh 
environments may have favored the mutation of Pi-BatCoV HKU5 especially in its S 
protein which is responsible for receptor binding and immunogenicity. Interestingly, this 
bat species is not only widely distributed in China, Russia, Korea, Japan, Vietnam, 
Burma and India, but also the Kingdom of Saudi Arabia and neighboring countries (42, 
85). Moreover, other Pipistrellus bats including P. arabicus, P. ariel, P. kuhlii, P. 
pipistrellus, P. rueppellii and P. savii have been recorded in the Arabian Peninsula 
(www.iucn.org). In fact, the partial sequences closely related to MERS-CoV detected in
bats from Europe also originated from Pipistrellus bats (P. pipistrellus, P. nathusii
and P. pygmaeus) of the family Vespertilionidae, and those from Ghana originated
from Nycteris bats (Nycteris cf. gambiensis) of the related family Nycteridae (44).
Similarly, the bat betaCoV strain related to MERS-CoV detected in Mexico
originated from a N. laticaudatus bat belonging to Molossidae, a family closely related to
Vespertilionidae (45, 86). The difference between this bat betaCoV and MERS-CoV
within the partial nsp14 sequence was also found to be mainly due to substitutions in the
third nucleotide positions of codons, suggesting strong purifying selection (45). However, S gene
sequences were not available from these bat viruses for further analysis of 
polymorphisms and selective pressures. Nevertheless, based on our existing data, bats
belonging to Vespertilionidae and related families, especially Pipistrellus bats and those
with diverse habitats, in the Arabian Peninsula should be intensively screened for potential
ancestral viruses of MERS-CoV, which may have evolved through mutations in the S
gene especially in the RBD, allowing efficient transmission to other animals or humans. In
contrast, lesser bamboo bats, the host species for Ty-BatCoV HKU4 and one of the 
smallest mammals in the world with body weight 3 to 7 g, have much more restricted 
habitats. Though this species also belongs to the family Vespertilionidae, it is remarkably 
adapted to roost inside bamboo stems, and is mainly found in rural areas in Hong Kong 
and various Asian countries (85). This may, in turn, reflect the lower mutation rate 
observed in the S gene of Ty-BatCoV HKU4. 

It remains to be determined if Ty-BatCoV HKU4 and Pi-BatCoV HKU5, as well
as other lineage C betaCoVs in bats, utilize the same receptor as MERS-CoV. Recent 
studies have shown that MERS-CoV utilizes DPP4 as its functional receptor (47, 79). 
This suggested that these betaCoVs belonging to lineage C may utilize receptor(s) 
different from those of other CoVs. Moreover, expression of bat (P. pipistrellus) DPP4 in 
non-susceptible cells was found to enable infection by MERS-CoV (47), which is in line 
with the ability of the virus to replicate in cell lines from Rousettus, Rhinolophus, 
Pipistrellus, Myotis, and Carollia bats (79). As DPP4 is a evolutionarily conserved 
protein (47), it may also explain the broad species tropism observed in primate, porcine, 
and rabbit cell lines and reflect the zoonotic origin of MERS-CoV (46, 79). However, Ty- 
BatCoV HKU4 and Pi-BatCoV HKUS5, as with other bat CoVs, have not been 
successfully cultured in vitro, which hampers studies on their receptor binding and host 
adaptation. Further discoveries of lineage C betaCoVs in animals and studies on the 
receptors of the different animal counterparts in their respective hosts may help 
understand the mechanism of interspecies transmission and emergence of MERS-CoV. 

Bats are increasingly recognized as reservoirs for various zoonotic viruses 
including SARS CoV, lyssavirus, rabies virus, Hendra, Nipah, Ebola as well as influenza 
virus (87, 88). While the existence of CoVs in bats was unknown before the SARS 
epidemic, it is now known that the different bat populations harbor diverse CoVs, which 
is likely the result of their species diversity, roosting behavior and migrating ability (16, 
18, 29, 31, 32, 89). These warm-blooded flying vertebrates are also ideal hosts to fuel 
CoV recombination and dissemination (5, 27, 59). It remains to be ascertained if bats 
could also be the animal origin for the emergence of MERS-CoV either directly or via an 
intermediate host, the latter as in the case of SARS CoV where the bat ancestral virus 
may have jumped to the intermediate host when bats were in contact or mixed with other 
animals (16). Since a history of contact with animals such as camels and goats has been 
reported in MERS-CoV-infected cases (90), the virus may have jumped from bats to 
these animals before infecting humans. Surveillance studies of lineage C betaCoVs from 
bats and other animals in the Middle East may help identify the origin and chain of 
transmission of MERS-CoV. 

LEGENDS TO FIGURES 

FIG 1 Phylogenetic analysis of RdRp, S and N genes of Ty-BatCoV HKU4 and Pi- 
BatCoV HKU5 strains, and those of other betaCoVs with available complete genome 
sequences. The trees were constructed by maximum-likelihood method with bootstrap 
values calculated from 100 trees. 937, 1535, and 546 aa positions in RdRp, S, and N 
genes respectively were included in the analysis. The scale bar indicates the estimated 
number of substitutions per 5 or 20 aa. HCoV-HKU1, human coronavirus HKU1; HCoV- 
OC43, human coronavirus OC43; MHV, murine hepatitis virus; BCoV, bovine 
coronavirus; PHEV, porcine hemagglutinating encephalomyelitis virus; GiCoV, giraffe 
coronavirus; RCoV, rat coronavirus; ECoV, equine coronavirus; RbCoV HKU14, rabbit 
coronavirus HKU14; AntelopeCoV, sable antelope coronavirus; SARS-CoV, SARS 
coronavirus; SARSr-Rh-BatCoV HKU3, SARS-related Rhinolophus bat coronavirus 
HKU3; SARSr-CiCoV, SARS-related civet coronavirus; SARSr CoV CFB, SARS-related 
Chinese ferret badger coronavirus; Ty-BatCoV HKU4, Tylonycteris bat coronavirus 
HKU4; Pi-BatCoV HKU5, Pipistrellus bat coronavirus HKU5; MERS-CoV EMC, 
Middle East Respiratory Syndrome Coronavirus EMC; MERS-CoV England1, Middle 
East Respiratory Syndrome Coronavirus England1; Ro-BatCoV HKU9, Rousettus bat 
coronavirus HKU9. 

FIG 2 Distribution of amino acid changes in the spike protein of Ty-BatCoV HKU4 
(upper panel) and Pi-BatCoV HKU5 (lower panel). The positions of the amino acid 
changes are depicted by vertical lines. SS, predicted signal peptide; RBD, receptor 
binding domain; HR1, heptad repeat 1; HR2, heptad repeat 2; TM, transmembrane 
domain. 

FIG 3 Graphical representation of multiple sequence alignment showing the amino acid 
changes in the spike protein of Pi-BatCoV HKU5. The height of symbols indicates the 
relative frequency of each amino acid at the position. Polar amino acids are indicated in 
green; neutral amino acids are indicated in purple; basic amino acids are indicated in blue; 
acidic amino acids are indicated in red; hydrophobic amino acids are indicated in black. 
The figure was generated using WebLogo (91). 

FIG 4 Distribution of positively selected sites in S proteins identified using REL in Ty- 
BatCoV HKU4 (upper panel) and Pi-BatCoV HKU5 (lower panel). Positively selected 
sites with posterior probability greater than 0.5 are shown. 

FIG 5 Estimation of the tMRCA of Ty-BatCoV HKU4 and Pi-BatCoV HKU5. The time- 
scaled phylogeny was summarized from all MCMC phylogenies of the (A) complete 
RdRp, (B) complete N and (C) 904-bp RdRp sequence data set analyzed under the 
relaxed clock model with an exponential distribution (Uced) in BEAST v1.7.4. Viruses 
characterized in this study are bolded. 

TABLE 1 Primers used in this study 


Coronavirus and gene    Forward primer                          Backward primer

Ty-BatCoV HKU4

RdRp  LPW3283   5’-GTAATGTCTGTCAGTATTGGGTT-3’     LPW3232   5’-AACTAATATGCTCTTTAACACTTCAC-3’
      LPW2771   5’-TGYTAYGCTTTAMGNCAYTTYGA-3’     LPW2773   5’-GTTGGGTAATAACAAAATCACCAA-3’
      LPW2626   5’-GTTTTAACACTYGATAAYGARGA-3’     LPW2630   5’-AGTATATTGAARTTNGCACARTG-3’
      LPW2738   5’-CCACCCTAATTGTGTTAATTGTA-3’     LPW2775   5’-TAACTGAAGACCCTTCCTTGAAA-3’
      LPW3233   5’-GGCAATTTTAATAAAGATTTTTATGA-3’  LPW3234   5’-GCCAAAATCAATGACGCTAAAAT-3’
      LPW1507   5’-GGTTGGGACTATCCTAAGTGTGA-3’     LPW1508   5’-CCATCATCAGATAGAATCATCATA-3’
      LPW1037   5’-WTATKTKAARCCWGGTGG-3’          LPW1040   5’-KYDBWRTTRTARCAMACAAC-3’
      LPW3235   5’-CTTAATAAACACTTTTCTATGATGAT-3’  LPW2678   5’-TACTCACCGAGCTGTACTTTACTA-3’

S     LPW3797   5’-AGATTTATATAAAATTATGGGAA-3’     LPW4102   5’-TACGTGGTTTTAATATGCAATAAAA-3’
      LPW3899   5’-TCTCTTACTAATACATCGGCT-3’       LPW3900   5’-AAGACCTGACCATCTTCAGAAA-3’
      LPW4103   5’-TGGTGCAAACCAAGATGTTGAAA-3’     LPW3712   5’-CTAGCGCTATAACTTCTAAAAGTA-3’
      LPW3720   5’-CATTAGTAGTTAGTGATTGTAAA-3’     LPW2821   5’-GTCATAAAGTGGTGGTAAAACTT-3’
      LPW2319   5’-ATTAATGCTAGAGAYCTHMTTTG-3’     LPW2320   5’-TTTGGGTAACTCCAATNCCRTT-3’
      LPW2824   5’-TTTGCCGCTATACCTTTTGCACAA-3’    LPW4106   5’-TGAGTTATAGGTTCAGGTTTATAA-3’
      LPW4105   5’-TATTAGTGACATCCTTGCTAGGCTT-3’   LPW2317   5’-GAGCCAAACATACCANGGCCAYTT-3’
      LPW4107   5’-ATGGTCCTAACTTTGCAGAGATA-3’     LPW21565  5’-TGCCAGACATGCCACCACAA-3’

N     LPW21407  5’-AACGAATCTTAATAACTCATTGTT-3’    LPW21408  5’-CTCTTGTTACTCTTCATTGGCAT-3’

Pi-BatCoV HKU5

RdRp  LPW3350   5’-TTTGTCAATTTTGGATAGGACAT-3’     LPW3352   5’-TGATGCATCACAGCARCCATA-3’
      LPW3351   5’-ATCAGAATAACTGTGAAGTGCTT-3’     LPW3275   5’-GACAATTGGACCAAAAGACGTT-3’
      LPW3382   5’-CAAATTGTGTGAACTGTACTGAT-3’     LPW3387   5’-ATATATCTCGAAGTAACGATCAA-3’
      LPW3172   5’-GTCCTGGCAACTTTAATAAAGATT-3’    LPW3130   5’-CTAATATGAGAGATGCAAAGA-3’
      LPW1507   5’-GGTTGGGACTATCCTAAGTGTGA-3’     LPW1508   5’-CCATCATCAGATAGAATCATCATA-3’
      LPW3384   5’-CTAAATTTGTGGACAGGTATTAT-3’     LPW3399   5’-CTTCGTATACACGTACCACAA-3’

S     LPW21416  5’-CTCTTGTCGCAGGGTAAACTT-3’       LPW4284   5’-AAAGACTCTACCTGTGCAGAATA-3’
      LPW4086   5’-TAACTTATACTGGACTGTACCCAAA-3’   LPW4193   5’-AAGCCATTTGAAGGTTACCATT-3’
      LPW4192   5’-ACTTTGCTACTTTACCTGTGTAT-3’     LPW4137   5’-AGTAACACCAAATGTGAAATT-3’
      LPW4285   5’-AATCGCCACTCTAAACTTTACTA-3’     LPW4286   5’-AAGAGGCTGGGTATTCTGGGTT-3’
      LPW4138   5’-AAGATGAGTCTATTGCTAATCTAT-3’    LPW4139   5’-AGCTTCCATATAGGGGTCATA-3’
      LPW4287   5’-TGTGCACAATATGTTGCTGGCTA-3’     LPW4288   5’-AAAGAACTACCAGTATAATACCAA-3’
      LPW4140   5’-AACACTGAGAATCCACCAAA-3’        LPW21417  5’-CACACGCATCATAAGTTCGTT-3’

N     LPW21361  5’-GAATCTTATTATCTCATTGTT-3’       LPW21362  5’-CTATTACGTTCAATTGGCAAT-3’

TABLE 2 Detection of Ty-BatCoV HKU4 and Pi-BatCoV HKU5 in bats by RT-PCR 


Bats                                                                No. (%) of bats positive for    No. (%) of bats positive for
                                                                    CoV in respiratory samples      CoV in alimentary samples
Scientific name             Common name                 No. of      Ty-BatCoV     Pi-BatCoV         Ty-BatCoV     Pi-BatCoV
                                                        bats tested HKU4          HKU5              HKU4          HKU5

Megachiroptera
  Pteropodidae
    Cynopterus sphinx       Short-nosed fruit bat       26          0 (0)         0 (0)             0 (0)         0 (0)
    Rousettus leschenaulti  Leschenault’s rousette      73          0 (0)         0 (0)             0 (0)         0 (0)
Microchiroptera
  Hipposideridae
    Hipposideros armiger    Himalayan leaf-nosed bat    198         0 (0)         0 (0)             0 (0)         0 (0)
    Hipposideros pomona     Pomona leaf-nosed bat       642         0 (0)         0 (0)             0 (0)         0 (0)
  Rhinolophidae
    Rhinolophus affinus     Intermediate horseshoe bat  359         0 (0)         0 (0)             0 (0)         0 (0)
    Rhinolophus pusillus    Least horseshoe bat         89          0 (0)         0 (0)             0 (0)         0 (0)
    Rhinolophus sinicus     Chinese horseshoe bat       2012        0 (0)         0 (0)             0 (0)         0 (0)
  Vespertilionidae
    Hypsugo pulveratus      Chinese pipistrelle         1           0 (0)         0 (0)             0 (0)         0 (0)
    Miniopterus magnater    Greater bent-winged bat     15          0 (0)         0 (0)             0 (0)         0 (0)
    Miniopterus pusillus    Lesser bent-winged bat      450         0 (0)         0 (0)             0 (0)         0 (0)
    Miniopterus schreibersii  Common bent-winged bat    758         0 (0)         0 (0)             0 (0)         0 (0)
    Myotis chinensis        Chinese myotis              122         0 (0)         0 (0)             0 (0)         0 (0)
    Myotis horsfieldii      Horsfield’s bat             7           0 (0)         0 (0)             0 (0)         0 (0)
    Myotis muricola         Whiskered myotis            4           0 (0)         0 (0)             0 (0)         0 (0)
    Myotis ricketti         Rickett’s big-footed bat    307         0 (0)         0 (0)             0 (0)         0 (0)
    Nyctalus noctula        Brown noctule               54          0 (0)         0 (0)             0 (0)         0 (0)
    Pipistrellus abramus    Japanese pipistrelle        219         0 (0)         0 (0)             0 (0)         55 (25)
    Pipistrellus tenuis     Least pipistrelle           11          0 (0)         0 (0)             0 (0)         0 (0)
    Scotophilus kuhlii      Lesser yellow bat           18          0 (0)         0 (0)             0 (0)         0 (0)
    Tylonycteris pachypus   Lesser bamboo bat           115         0 (0)         0 (0)             29 (29)       0 (0)
    Tylonycteris robustula  Greater bamboo bat          1           0 (0)         0 (0)             0 (0)         0 (0)

TABLE 3 Pairwise amino acid identities of the RdRp, S and N genes of Ty-BatCoV HKU4, Pi-BatCoV HKU5 and MERS-CoV to those of other betaCoVs 


Coronaviruses             Pairwise amino acid identity (%)
                          Ty-BatCoV HKU4                      Pi-BatCoV HKU5                      MERS-CoV
                          RdRp        S           N           RdRp        S           N           RdRp        S           N
Betacoronavirus lineage A
  HCoV-OC43               68.8        33.4        33.2        68.7        31.2        34.2        68.3        32          35.3
  BCoV                    68.7        33.5        33.2        68.6        31.3        34.8        68.2        31.3        35.6
  PHEV                    68.8        33.2        32.7        68.7        31.2        33.9        68.3        32.5        35.1
  GiCoV                   68.7        33.9        32.8        68.6        31.6        34.8        68.2        31.4        35.3
  RCoV                    68.8        32.4        33.1        68.8        31.4        34.3        68.7        32          34.8
  RbCoV HKU14             68          33.8        33.2        68          30.9        34.9        68          32.2        35.3
  AntelopeCoV             68.7        33.7        32.8        68.6        31.2        34.8        68.2        31.4        35.3
  ECoV                    69.1        32.4        34.9        68.7        31.5        35.6        68.3        31.6        35.7
  MHV                     68.7        32.7        34.1        68.8        31.9        34.7        68.6        31.5        34.3
  HCoV-HKU1               67.6        32.1        32.8        68.1        30.2        33.3        67.9        31.8        32.3
Betacoronavirus lineage B
  SARS-CoV                71.6        33.6        45.8        71.8        33.5        43.6        71.9        31.6        46.6
  SARSr-Rh-BatCoV HKU3    77          33.6        45.2        TAT.        32.8        43.9        71.8        30.6        46.2
Betacoronavirus lineage C
  Ty-BatCoV HKU4          99.5-100    97.3-99.6   99.5-100    92-92.5     67.7-68.1   73.5-74     89.6-90     66.8-67.4   71.9-72.3
  Pi-BatCoV HKU5          92.1-92.4   67.5-68.4   73.7-75.1   99.4-99.7   88.3-97     97.2-98.6   92.1-92.3   63.4-64.5   69.5-70.5
  MERS-CoV                89.9        67.3-67.4   71.6-72.1   92.1        64.3        68.8-69.5   -           -           -
Betacoronavirus lineage D
  Ro-BatCoV HKU9          69.3        30.8        37.3        68.7        31          36.9        68.4        30.3        37.8
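
As an illustration of how a single pairwise amino acid identity of the kind listed in Table 3 can be derived from two aligned sequences, a minimal Python sketch follows. The function name, the gap-handling rule (alignment columns with a gap in either sequence are skipped) and the toy sequences are assumptions made for illustration only; they are not the alignment software or identity definition used in this study, and different tools treat gaps differently.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned, i.e. have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration; not real CoV sequences.
identity = percent_identity("MKT-LLVA", "MKTALIVA")
print(f"{identity:.1f}% identity")
```

When several strains of one virus are compared with a reference, reporting the minimum and maximum of such values gives a range in the style of the lineage C rows of Table 3.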




TABLE 4 Estimation of non-synonymous and synonymous substitution rates in the 
RdRp, S and N genes of Ty-BatCoV HKU4, Pi-BatCoV HKU5 and MERS-CoV 

Gene    Ty-BatCoV HKU4 (18 strains)    Pi-BatCoV HKU5 (19 strains)    MERS-CoV (2 strains)
        Ka      Ks      Ka/Ks          Ka      Ks      Ka/Ks          Ka      Ks      Ka/Ks
RdRp    0.001   0.033   0.03           0.001   0.128   0.0078         0       0.006   0
S       0.004   0.034   0.118          0.038   0.321   0.118          0.001   0.008   0.125
N       0.001   0.019   0.053          0.005   0.095   0.053          0.002   0.010   0.2
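
The Ka/Ks column of Table 4 is the ratio of the non-synonymous to the synonymous substitution rate, and a ratio below 1 is conventionally read as purifying selection. The short Python sketch below recomputes the ratios from the tabulated Ka and Ks values. The dictionary layout and function name are illustrative assumptions, and because the inputs are the rounded values from the table, the recomputed ratios may differ slightly from the published ones.

```python
# Ka and Ks values transcribed from Table 4 (rounded as tabulated).
# The data layout and helper name are assumptions for this sketch.
TABLE4 = {
    "Ty-BatCoV HKU4": {"RdRp": (0.001, 0.033), "S": (0.004, 0.034), "N": (0.001, 0.019)},
    "Pi-BatCoV HKU5": {"RdRp": (0.001, 0.128), "S": (0.038, 0.321), "N": (0.005, 0.095)},
    "MERS-CoV": {"RdRp": (0.0, 0.006), "S": (0.001, 0.008), "N": (0.002, 0.010)},
}


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks; the ratio is undefined when Ks is zero."""
    if ks == 0:
        raise ZeroDivisionError("Ks is zero; Ka/Ks is undefined")
    return ka / ks


# Recompute the ratio for every virus and gene in the table.
ratios = {
    (virus, gene): ka_ks_ratio(ka, ks)
    for virus, genes in TABLE4.items()
    for gene, (ka, ks) in genes.items()
}

for (virus, gene), ratio in ratios.items():
    # A ratio below 1 is conventionally interpreted as purifying selection.
    verdict = "purifying" if ratio < 1 else "neutral or positive"
    print(f"{virus:<15} {gene:<5} Ka/Ks = {ratio:.3f} ({verdict})")
```

Every ratio recomputed this way is below 1, in line with the purifying selection reported for these genes.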


[FIG 1 image not reproduced: phylogenetic trees for the RdRp, S and N panels.]